\documentclass{report}

\usepackage{array}
\usepackage{hyperref}
\usepackage{amsmath}
\usepackage{tikz}
\usepackage{pgfplots}

\usepackage{listings}
\lstset{language=Python}

\usepackage{amsthm}
\newtheorem{axiom}{Axiom}[section]
\newtheorem{definition}{Definition}[section]
\newtheorem{theorem}{Theorem}[section]
\newtheorem{corollary}{Corollary}[section]
\newtheorem{lemma}{Lemma}
\newtheorem{remark}{Remark}

\tikzset{
  treenode/.style = {shape=rectangle, rounded corners,
                     draw, align=center,
                     top color=white, bottom color=blue!20},
  root/.style     = {treenode, font=\Large, bottom color=red!30},
  env/.style      = {treenode, font=\ttfamily\normalsize},
  dummy/.style    = {circle,draw}
}


\begin{document}

\title{Mathematics}
\author{Yan Xiaohan\\
  \texttt{xiaohan.yan@foxmail.com}}
\date{Jan, 2020}
\maketitle

\tableofcontents

\chapter{Data and Graphs}

\section{Collect and Interpret Data}

\textbf{Data} are pieces of information that can be gathered through interviews, records of events, or questionnaires. Collecting information from an entire group, or \textit{population}, is called a \textbf{census}. To save time and money, many people instead conduct a survey, or poll.

Surveys and polls provide data about a part, or \textbf{sample}, of the population.

To make decisions about a population based on a sample, choose an appropriate sampling method.
 
\textbf{Random Sampling} Each member of the population has an equal chance of being selected. The members are chosen independently of one another.

\textbf{Convenience Sampling} The sample is chosen only because it is easily available.

\textbf{Systematic Sampling} After a population is ordered in some way, the sample is chosen according to a pattern.

\textbf{Cluster Sampling} Members of the population are chosen at random from a particular part of the population and then polled in clusters.
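
\subsection{Implementation in Python}

The four sampling methods above can be sketched with the standard \texttt{random} module. This is only an illustration: the population of 20 numbered members, the sample sizes, the step of 4 and the fixed seed are all assumptions made for the example, not part of the definitions.

\begin{lstlisting}
import random

# Hypothetical population: 20 members numbered 1 to 20.
population = list(range(1, 21))
rng = random.Random(0)  # fixed seed so a run can be repeated

# Random sampling: every member has an equal chance of selection.
random_sample = rng.sample(population, 5)

# Convenience sampling: take whoever is easiest to reach,
# here simply the first 5 members of the list.
convenience_sample = population[:5]

# Systematic sampling: order the population, then follow a pattern,
# here every 4th member from a random starting position.
step = 4
start = rng.randrange(step)
systematic_sample = population[start::step]

# Cluster sampling: choose members at random from one particular
# part of the population, here the third group of 5 members.
parts = [population[i:i + 5] for i in range(0, len(population), 5)]
part = parts[2]
cluster_sample = rng.sample(part, 3)

print(random_sample)
print(convenience_sample)
print(systematic_sample)
print(cluster_sample)
\end{lstlisting}

Changing the seed changes which members are drawn, but the size and structure of each sample stay the same.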

\section{Measures of Central Tendency and Range}

Four statistical measures that help you describe a set of data are the mean, median, mode and range.

The \textbf{mean}, or \textbf{arithmetic average}, is the sum of the values divided by the number of items of data.

The \textbf{median} is the middle value when the data are arranged in numerical order. If the number of items of data is odd, the median is the middle value. If the number of items of data is even, the median is the mean of the two middle values: add them and divide by 2.

The \textbf{mode} is the value that occurs most frequently in the set of data. Some sets of data have no mode. Some have more than one mode.

The \textbf{range} is the difference between the greatest and the least values in a set of data.

Because the mean, median, and mode locate the center of a set of data, they are called \textbf{measures of central tendency}.

\subsection{Implementation in Python}

\begin{lstlisting}
In [1]: import statistics as st

In [2]: l = [1, 1, 1, 2, 2, 2, 3, 4]

In [3]: st.mean(l)
Out[3]: 2

In [4]: st.median(l)
Out[4]: 2.0

In [5]: st.mode(l)
Out[5]: 1

In [6]: st.multimode(l)
Out[6]: [1, 2]

In [7]: max(l) - min(l)
Out[7]: 3
\end{lstlisting}

\section{Stem-and-Leaf Plots}

A \textbf{stem-and-leaf plot} organizes and displays data. The last digits of the data values are the \textbf{leaves}. The digits in front of the leaves are the \textbf{stems}.

Data values that are much greater than or much less than most of the other values can be called \textbf{outliers}. \textbf{Clusters} are isolated groups of values. \textbf{Gaps} are large spaces between values.

\subsection{Implementation in Python}

\begin{lstlisting}
import stemgraphic
ages = [23, 45, 54, 41, 28, 23, 42, 46, 58, 31, 24, 43, 46, 37, 25, 39, 41, 24]
stemgraphic.stem_graphic(ages, median_color='red', outliers_color='blue', scale=10)
\end{lstlisting}
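
For comparison, a text-only plot can be built without any third-party package. The sketch below assumes non-negative whole numbers, so that the last digit is the leaf and the remaining digits are the stem; the helper name \texttt{stem\_and\_leaf} is made up for this example.

\begin{lstlisting}
from collections import defaultdict

ages = [23, 45, 54, 41, 28, 23, 42, 46, 58, 31, 24, 43, 46, 37, 25, 39, 41, 24]

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its leading digits (stem)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 23 -> stem 2, leaf 3
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(ages)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))

# Prints:
# 2 | 3 3 4 4 5 8
# 3 | 1 7 9
# 4 | 1 1 2 3 5 6 6
# 5 | 4 8
\end{lstlisting}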

\section{Problem Solving Skills: Circle Graphs}

A \textbf{circle graph}, or \textit{pie chart}, shows how data are divided into categories that do not overlap. Each \textbf{sector}, or slice, represents a percentage of the total. The entire circle represents 100\% of the data. A \textbf{key}, or \textit{legend}, describes the data in each sector. To solve a circle graph problem, it helps to break it into smaller parts. This strategy is often called \textbf{solve a simpler problem}.

\subsection{Implementation in Python}

\begin{lstlisting}
import matplotlib.pyplot as plt

labels = r'\$_\$', '@_@', '>_<', '*^*'  # raw string keeps the escaped dollar signs
sizes = [215, 130, 245, 210]
colors = ['gold', 'yellowgreen', 'lightcoral', 'lightskyblue']

plt.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%')

plt.axis('equal')
plt.show()
\end{lstlisting}
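
The percentage of each sector can also be worked out directly, which is a handy check when reading a circle graph. The sketch below reuses the \texttt{sizes} from the listing above and divides each value by the total; the variable names are chosen only for this example.

\begin{lstlisting}
sizes = [215, 130, 245, 210]

# The whole circle stands for the total of all categories.
total = sum(sizes)

# Each sector's share of the circle, as a percentage of the total.
percents = [100 * size / total for size in sizes]

for size, percent in zip(sizes, percents):
    print(f"{size} of {total} is {percent:.1f}%")

# The sectors together cover the entire circle.
print(sum(percents))
\end{lstlisting}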

\section{Frequency Tables and Pictographs}

A \textbf{frequency table} shows how often an item appears in a set of data. A tally mark is used to record each response. The total number of marks for a given response is the \textit{frequency} of that response.

A picture graph, or \textbf{pictograph}, displays data with graphic symbols. The key identifies the number of data items represented by each symbol. The symbols often represent rounded amounts.

To read a pictograph, start by interpreting the key. Then \textit{multiply} the value of one symbol by the number of symbols in a row to find the value for the row.
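
\subsection{Implementation in Python}

A frequency table can be sketched with \texttt{collections.Counter}, and reading a pictograph row is a single multiplication. The survey responses and the key value below are invented for illustration.

\begin{lstlisting}
from collections import Counter

# Hypothetical survey responses (illustrative data only).
responses = ["cat", "dog", "dog", "fish", "cat", "dog", "bird", "dog"]

# Frequency table: count how often each response appears.
frequency = Counter(responses)
for response, count in frequency.most_common():
    tally = "|" * count  # one tally mark per response
    print(f"{response:<5} {tally:<5} {count}")

# Pictograph: the key tells how many items one symbol stands for.
# Row value = value of one symbol * number of symbols in the row.
key_value = 10       # hypothetical key: one symbol = 10 items
symbols_in_row = 4
row_value = key_value * symbols_in_row
print(row_value)     # 40
\end{lstlisting}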

\section{Bar Graphs and Line Graphs}

In a \textbf{bar graph}, horizontal or vertical bars display data. A scale is used to show intervals. To read a bar graph, look at the end of each bar (the top edge of a vertical bar). Match that edge with the number on the scale to find the value of that bar.

On a \textbf{line graph}, points representing data are plotted, then connected with line segments. Because the points are connected in sequence, a line graph shows trends, or changes, in data over a period of time.

To read a line graph, locate the first data point. Relate it to the corresponding labeled points on the vertical and horizontal scales. Do the same for each of the other data points.
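
\subsection{Implementation in Python}

The two ideas above can be sketched in plain Python, without a plotting library. The text bars stand in for a horizontal bar graph, and the list of changes shows the rise or fall of each segment of a line graph. The monthly figures are invented for illustration.

\begin{lstlisting}
# Hypothetical monthly sales figures (illustrative data only).
months = ["Jan", "Feb", "Mar", "Apr", "May"]
sales = [12, 15, 9, 18, 21]

# Text bar graph: the length of each bar matches its value.
for month, value in zip(months, sales):
    print(f"{month} {'#' * value} {value}")

# Line-graph trend: the change between consecutive points is
# positive where the line rises and negative where it falls.
changes = [later - earlier for earlier, later in zip(sales, sales[1:])]
print(changes)  # [3, -6, 9, 3]
\end{lstlisting}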

\section{Scatter Plots and Lines of Best Fit}

A \textbf{scatter plot} displays two sets of related data on the same set of axes. Points represent the data in a scatter plot, but they are not connected. There can be more than one point for any number on either axis.

On some scatter plots, a \textbf{line of best fit}, or \textbf{trend line}, can be drawn near most of the points. A line of best fit that slopes up and to the right indicates a positive correlation among the data. A \textbf{positive correlation} means that as the horizontal axis values increase, the vertical axis values tend to increase.

A line of best fit that slopes down and to the right indicates a negative correlation. In a \textbf{negative correlation}, as the horizontal axis values increase, the vertical axis values tend to decrease. It is possible that there is no correlation.
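
\subsection{Implementation in Python}

One common way to compute a line of best fit is the least-squares method. The text above does not prescribe a method, so the choice here is an assumption. The sketch below finds the slope and intercept and reads the direction of the correlation from the sign of the slope. The function name \texttt{line\_of\_best\_fit} and the data are invented for illustration.

\begin{lstlisting}
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied and test scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

slope, intercept = line_of_best_fit(hours, scores)

# The sign of the slope gives the direction of the correlation.
if slope > 0:
    direction = "positive"
elif slope < 0:
    direction = "negative"
else:
    direction = "none"

print(slope, intercept, direction)  # 6.0 46.0 positive
\end{lstlisting}

The sketch assumes the horizontal values are not all equal, since otherwise the slope is undefined.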
\end{document}

